12 research outputs found

    Data Repurposing through Compatibility: A Computational Perspective

    Reuse of data in new contexts, beyond the purposes for which it was originally collected, has contributed to technological innovation and to reducing the consent burden on data subjects. One of the legal mechanisms that makes such reuse possible is purpose compatibility assessment. In this paper, I offer an in-depth analysis of this mechanism through a computational lens. I moreover consider what should qualify as repurposing, apart from using data for a completely new task, and argue that typical purpose formulations are an impediment to meaningful repurposing. Overall, the paper positions compatibility assessment as a constructive practice rather than an ineffective standard.

    Comment: To appear in the Special Issue of the Journal of Institutional and Theoretical Economics on "Machine Learning and the Law". Written for the Symposium on Machine Learning and the Law of the Max Planck Institute for Research on Collective Goods: https://www.coll.mpg.de/329557/segovia?c=6765

    Towards Query Logs for Privacy Studies: On Deriving Search Queries from Questions

    Translating verbose information needs into crisp search queries is a phenomenon that is ubiquitous but hardly understood. Insights into this process could be valuable in several applications, including synthesizing large privacy-friendly query logs from public Web sources that are readily available to the academic research community. In this work, we take a step towards understanding query formulation by tapping into the rich potential of community question answering (CQA) forums. Specifically, we sample natural language (NL) questions spanning diverse themes from the Stack Exchange platform, and conduct a large-scale conversion experiment in which crowdworkers submit the search queries they would use when looking for equivalent information. We provide a careful analysis of this data, accounting for possible sources of bias during conversion, along with insights into user-specific linguistic patterns and search behaviors. We release a dataset of 7,000 question-query pairs from this study to facilitate further research on query understanding.

    Comment: ECIR 2020 Short Paper
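
    As a toy illustration of the question-to-query conversion studied here (a naive keyword baseline of my own, not the paper's crowdsourced pipeline), one might strip function words from a verbose question to obtain a crisp query; the stopword list and term cap below are illustrative assumptions.

```python
import re

# Small illustrative stopword list; a real system would use a fuller
# lexicon (this list is an assumption, not taken from the paper).
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "were", "do", "does", "did",
    "i", "my", "me", "you", "your", "how", "what", "which", "can",
    "to", "of", "in", "on", "for", "and", "or", "it", "that", "this",
    "from", "when", "with", "at", "by",
}

def question_to_query(question: str, max_terms: int = 5) -> str:
    """Naive baseline: keep the content words of a verbose question."""
    tokens = re.findall(r"[a-z0-9']+", question.lower())
    content = [t for t in tokens if t not in STOPWORDS]
    return " ".join(content[:max_terms])

print(question_to_query(
    "How can I stop my laptop from overheating when it is charging?"
))  # -> "stop laptop overheating charging"
```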

    Equity of Attention: Amortizing Individual Fairness in Rankings

    Rankings of people and items are at the heart of selection processes, match-making, and recommender systems, ranging from employment sites to sharing economy platforms. As ranking positions influence the amount of attention the ranked subjects receive, biases in rankings can lead to an unfair distribution of opportunities and resources, such as jobs or income. This paper proposes new measures and mechanisms to quantify and mitigate unfairness from a bias inherent to all rankings, namely, the position bias, which leads to disproportionately less attention being paid to low-ranked subjects. Our approach differs from recent fair ranking approaches in two important ways. First, existing works measure unfairness at the level of subject groups, while our measures capture unfairness at the level of individual subjects and as such subsume group unfairness. Second, as no single ranking can achieve individual attention fairness, we propose a novel mechanism that achieves amortized fairness, where attention accumulated across a series of rankings is proportional to accumulated relevance. We formulate the challenge of achieving amortized individual fairness subject to constraints on ranking quality as an online optimization problem and show that it can be solved as an integer linear program. Our experimental evaluation reveals that unfair attention distribution in rankings can be substantial, and demonstrates that our method can improve individual fairness while retaining high ranking quality.

    Comment: Accepted to SIGIR 2018
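
    A minimal sketch of the amortization idea (a greedy simplification, not the paper's integer linear program; the geometric attention model and the weight lam are illustrative assumptions): accumulate position-bias attention and relevance per subject across rankings, and promote subjects whose accumulated attention lags their accumulated relevance.

```python
def attention(position: int, p: float = 0.5) -> float:
    """Geometric position-bias model: attention decays with rank."""
    return p * (1 - p) ** position

def amortized_rankings(relevance, n_rounds: int, lam: float = 1.0):
    """Yield one ranking per round; subjects owed attention move up."""
    n = len(relevance)
    acc_att = [0.0] * n  # attention accumulated so far
    acc_rel = [0.0] * n  # relevance accumulated so far
    for _ in range(n_rounds):
        # Score = relevance minus each subject's attention surplus.
        surplus = [acc_att[i] - acc_rel[i] for i in range(n)]
        order = sorted(range(n),
                       key=lambda i: relevance[i] - lam * surplus[i],
                       reverse=True)
        for pos, i in enumerate(order):
            acc_att[i] += attention(pos)
            acc_rel[i] += relevance[i]
        yield order

# Near-equal relevance: top positions rotate so attention amortizes.
for ranking in amortized_rankings([0.9, 0.85, 0.8], n_rounds=4):
    print(ranking)  # [0, 1, 2], [2, 1, 0], [0, 1, 2], [1, 0, 2]
```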

    Fairness and Bias in Algorithmic Hiring

    Employers are adopting algorithmic hiring technology throughout the recruitment pipeline. Algorithmic fairness is especially applicable in this domain due to its high stakes and structural inequalities. Unfortunately, most work in this space provides partial treatment, often constrained by two competing narratives, optimistically focused on replacing biased recruiter decisions or pessimistically pointing to the automation of discrimination. Whether, and more importantly what types of, algorithmic hiring can be less biased and more beneficial to society than low-tech alternatives currently remains unanswered, to the detriment of trustworthiness. This multidisciplinary survey caters to practitioners and researchers with a balanced and integrated coverage of systems, biases, measures, mitigation strategies, datasets, and legal aspects of algorithmic hiring and fairness. Our work supports a contextualized understanding and governance of this technology by highlighting current opportunities and limitations and providing recommendations for future work to ensure shared benefits for all stakeholders.
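
    As one concrete example of the bias measures such a survey covers (an illustration of the genre, not a method taken from the paper): the adverse-impact ratio compares per-group selection rates, and the widely cited four-fifths rule flags ratios below 0.8.

```python
from collections import Counter

def impact_ratio(selected, group):
    """Each group's selection rate divided by the highest group rate."""
    totals, hires = Counter(), Counter()
    for s, g in zip(selected, group):
        totals[g] += 1
        hires[g] += int(s)
    rates = {g: hires[g] / totals[g] for g in totals}
    best = max(rates.values())
    return {g: r / best for g, r in rates.items()}

# Hypothetical screening outcomes for two groups.
selected = [1, 0, 1, 1, 0, 0, 1, 0]
group    = ["a", "a", "a", "a", "b", "b", "b", "b"]
ratios = impact_ratio(selected, group)
print(ratios)                                   # {'a': 1.0, 'b': 0.33...}
print({g: r < 0.8 for g, r in ratios.items()})  # group 'b' is flagged
```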

    Fair Inputs and Fair Outputs: The Incompatibility of Fairness in Privacy and Accuracy

    Fairness concerns about algorithmic decision-making systems have mainly focused on the outputs (e.g., the accuracy of a classifier across individuals or groups). However, one may additionally be concerned with fairness in the inputs. In this paper, we propose and formulate two properties regarding the inputs of (features used by) a classifier. In particular, we claim that fair privacy (whether individuals are all asked to reveal the same information) and need-to-know (whether users are only asked for the minimal information required for the task at hand) are desirable properties of a decision system. We explore the interaction between these properties and fairness in the outputs (fair prediction accuracy). We show that, for an optimal classifier, these three properties are in general incompatible, and we explain what common properties of data make them incompatible. Finally, we provide an algorithm to verify whether the trade-off between the three properties exists in a given dataset, and use the algorithm to show that this trade-off is common in real data.
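
    A toy sketch of the two input properties (the per-individual feature-query representation and the decision function below are my assumptions, not the paper's formalization): fair privacy holds when everyone is asked for the same features; need-to-know is approximated by checking that no queried feature can be dropped without changing the decision.

```python
def fair_privacy(queried: dict) -> bool:
    """True if every individual is asked to reveal the same features."""
    return len({frozenset(f) for f in queried.values()}) <= 1

def need_to_know(queried: dict, decide) -> bool:
    """True if no queried feature can be dropped without changing the
    decision (a minimality proxy, not the paper's exact test)."""
    for person, feats in queried.items():
        full = decide(person, frozenset(feats))
        for f in feats:
            if decide(person, frozenset(feats) - {f}) == full:
                return False  # feature f was not needed
    return True

def decide(person, feats):  # hypothetical one-feature classifier
    return "income" in feats

queried = {"alice": {"income", "age"}, "bob": {"income"}}
print(fair_privacy(queried))                                # False: bob reveals less
print(need_to_know({"alice": {"income", "age"}}, decide))   # False: age is unused
```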

    Learning to Un-Rank: Quantifying Search Exposure for Users in Online Communities

    Search engines in online communities such as Twitter or Facebook not only return matching posts, but also provide links to the profiles of the authors. Thus, when a user appears in the top-k results for a sensitive keyword query, she becomes widely exposed in a sensitive context. The effects of such exposure can amount to a serious privacy violation, ranging from embarrassment all the way to becoming a victim of organizational discrimination. In this paper, we propose the first model for quantifying search exposure on the service provider side, casting it as a reverse k-nearest-neighbor problem. Moreover, since a single user can be exposed by a large number of queries, we also devise a learning-to-rank method for identifying the most critical queries and thus making the warnings user-friendly. We develop efficient algorithms, and present experiments with a large number of user profiles from Twitter that demonstrate the practical viability and effectiveness of our framework.
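
    A brute-force sketch of the exposure idea (not the paper's reverse k-nearest-neighbor algorithms or its learning-to-rank model): invert a query-to-top-k index into a per-user list of exposing queries, with an assumed 1/(rank+1) criticality score standing in for the learned ranking.

```python
from collections import defaultdict

def exposure_report(topk_index: dict, k: int = 3) -> dict:
    """Map each user to (query, score) pairs for queries exposing them."""
    report = defaultdict(list)
    for query, ranked_authors in topk_index.items():
        for rank, user in enumerate(ranked_authors[:k]):
            report[user].append((query, 1.0 / (rank + 1)))
    for user in report:  # most critical queries first
        report[user].sort(key=lambda qs: qs[1], reverse=True)
    return dict(report)

# Hypothetical query -> top-k author index on the provider side.
index = {
    "layoffs":   ["u3", "u1", "u7"],
    "diagnosis": ["u1", "u9", "u2"],
}
print(exposure_report(index)["u1"])
# -> [('diagnosis', 1.0), ('layoffs', 0.5)]
```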